I've been asked to look into supporting unusual encodings (like
Cyrillic) with Type 1 fonts. Right now we only support
WinAnsiEncoding (plus special handling for Symbol and ZapfDingbats).

I already have an AFM parser. It is the precondition for safely
supporting non-standard encodings, since only the AFM file contains a
font's glyph list.

I'm now well on the way to supporting non-WinAnsi encodings since I can
build CodePointMapping instances from an AFM file. I then have to teach
the PDF and PS renderers to make use of these special encodings.
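
To illustrate what "building a mapping from an AFM file" involves, here
is a minimal sketch. It is not FOP's AFM parser and does not use
CodePointMapping; the class name AfmEncodingSketch and its fields are
made up for this example. It relies only on the AFM character metrics
line format ("C <code> ; WX <width> ; N <glyph name> ; ..."), where a
code of -1 marks a glyph that the font's native encoding does not reach.
The hexadecimal "CH" form is ignored here for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: collects the native encoding from AFM char metrics lines. */
final class AfmEncodingSketch {

    /** Glyph names keyed by character code (the font's native encoding). */
    final Map<Integer, String> encoded = new TreeMap<>();

    /** Glyphs present in the font but not reachable through the native encoding. */
    final List<String> unencoded = new ArrayList<>();

    static AfmEncodingSketch parse(List<String> afmLines) {
        AfmEncodingSketch result = new AfmEncodingSketch();
        for (String line : afmLines) {
            String trimmed = line.trim();
            // Only decimal char metrics entries are handled in this sketch.
            if (!trimmed.startsWith("C ")) {
                continue;
            }
            Integer code = null;
            String name = null;
            for (String field : trimmed.split(";")) {
                String[] tokens = field.trim().split("\\s+");
                if (tokens.length < 2) {
                    continue;
                }
                if (tokens[0].equals("C")) {
                    code = Integer.parseInt(tokens[1]);
                } else if (tokens[0].equals("N")) {
                    name = tokens[1];
                }
            }
            if (code == null || name == null) {
                continue; // incomplete entry, skip it
            }
            if (code >= 0) {
                result.encoded.put(code, name);
            } else {
                result.unencoded.add(name);
            }
        }
        return result;
    }
}
```

The "unencoded" list is exactly the set of glyphs that step 1 cannot
reach and that the two approaches below are meant to make available.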

That's step 1, but it will only make the font's native encoding
available in FOP. The number of available glyphs for a Type 1 font will
still remain under 255 (typically under 223 as the first 32 chars are
usually not used). To support all glyphs of a Type 1 font we need more
than that, and I found two possible approaches:

1. Treat Type 1 fonts as CID fonts.

+ Probably the cleaner approach.
+ All glyphs are supported under one single font (no renderer-level
  font switching required, see below)
- Makes the generated PDF/PS code a little less readable but that's not

2. Do something like OpenOffice does when handling fonts with more than
255 chars: Create multiple single-byte encodings which map to the same
base font. This will require a 1:n relationship from font to char
mapping which the renderers also have to handle. The first encoding will
be equal to the font's default encoding (PDF calls that the "implicit
base encoding"). The other encoding(s) will be built from the rest of
the available characters. In the renderer it will be necessary to switch
fonts from one character to another (not the same as switching from
Helvetica to Symbol, i.e. not at FO level, but at renderer level).

+ Higher compatibility with PDF viewers which are not yet
+ Keeps the generated PDF/PS code more readable (not important)
- Switching between derived fonts (i.e. fonts with a common base font
  but with special encodings) is necessary. SingleByteFont needs to be
  split in two classes.
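
The renderer-level switching in option 2 could look roughly like the
following sketch. It is only an illustration under simplifying
assumptions: each derived font's encoding is represented as a plain
String of the characters it can show, and the names EncodingRunSketch,
Run and split are hypothetical, not existing FOP classes or methods.
The text is cut into runs so that each run can be painted with a single
derived font.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits text into runs, one derived font per run. */
final class EncodingRunSketch {

    /** A stretch of text that one derived font (encoding) can show. */
    static final class Run {
        final int encodingIndex; // -1 if no encoding contains the characters
        final String text;

        Run(int encodingIndex, String text) {
            this.encodingIndex = encodingIndex;
            this.text = text;
        }
    }

    /** encodings.get(i) lists the characters reachable via derived font i. */
    static List<Run> split(String text, List<String> encodings) {
        List<Run> runs = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        int current = -1;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            int index = indexOfEncoding(ch, encodings);
            // A change of encoding closes the run collected so far.
            if (index != current && buffer.length() > 0) {
                runs.add(new Run(current, buffer.toString()));
                buffer.setLength(0);
            }
            current = index;
            buffer.append(ch);
        }
        if (buffer.length() > 0) {
            runs.add(new Run(current, buffer.toString()));
        }
        return runs;
    }

    private static int indexOfEncoding(char ch, List<String> encodings) {
        for (int i = 0; i < encodings.size(); i++) {
            if (encodings.get(i).indexOf(ch) >= 0) {
                return i;
            }
        }
        return -1; // character not available in this font at all
    }
}
```

A renderer would then select the derived font for each run before
writing its text, which is the extra work the CID approach avoids.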

An example: The "Baskerville Cyrillic" font contains 264
characters/glyphs. The default encoding only contains 221 characters. So
43 additional characters can be made available like this.
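
As a rough sketch of how the leftover glyphs could be packed into
additional single-byte encodings: the class below fills 256-slot tables
with the glyph names that the default encoding does not cover. The
class name, the firstCode parameter and the choice to leave unused
slots as ".notdef" are assumptions made for this illustration, not a
description of how FOP does or will do it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: packs unencoded glyphs into extra 256-slot encodings. */
final class AdditionalEncodingsSketch {

    /**
     * @param unencodedGlyphs glyph names not covered by the default encoding
     * @param firstCode lowest character code to assign (e.g. 32 to skip
     *                  the control character range)
     * @return one String[256] per additional encoding, index = character code
     */
    static List<String[]> build(List<String> unencodedGlyphs, int firstCode) {
        if (firstCode < 0 || firstCode > 255) {
            throw new IllegalArgumentException("firstCode must be in 0..255");
        }
        List<String[]> encodings = new ArrayList<>();
        String[] current = null;
        int code = 256; // forces allocation of the first table
        for (String glyph : unencodedGlyphs) {
            if (code > 255) {
                // Current table is full (or none exists yet): start a new one.
                current = new String[256];
                Arrays.fill(current, ".notdef");
                encodings.add(current);
                code = firstCode;
            }
            current[code++] = glyph;
        }
        return encodings;
    }
}
```

For the Baskerville Cyrillic case, the 264 - 221 = 43 leftover glyphs
fit into a single additional encoding, so the font would be exposed as
two derived fonts in total.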

I'm currently leaning towards CID fonts as it is probably the cleaner
approach. Both solutions are probably pretty much the same in terms of
effort. The CID approach will take more work in the PS renderer and the
multi-encoding approach will make changes necessary in FOP's font

If anyone has thoughts on this, I'd appreciate hearing them. I'll finish
the changes for supporting the default encodings and then finish the
processing feedback stuff before I continue with this.

Jeremias Maerki
