Re: TrueType Font Embedding

Jeremias Maerki Fri, 12 Nov 2010 00:32:16 -0800

On 11.11.2010 22:10:57 Eric Douglas wrote:
> If using installed fonts is an option to save space in the file / data
> stream, using embedded fonts still needs to be an option.


Eric, we're not talking about removing anything. We're talking about
adding TrueType support to PostScript output and handling referenced
TrueType fonts with possibly full Unicode support.

> I am assigning specific fonts from specific files to get consistent
> output so everything must be embedded.  I don't want to have to care
> what is installed where.  I am glad to fix this headache I've had with
> Windows 98 trying to use Courier New fonts and different PCs with the
> same OS had a different font file, and trying to render on the server
> versus the client having one not installed or different fonts installed
> with the same name.
> 
> The problem I'm currently having with output is rendering special
> unicode glyphs.  I sent one unicode as a 25AB with the font file
> LTYPE.TTF which came installed with Windows XP.  In FOP 0.95 it produced
> a square which is what I want.  That character is supposed to be a
> square.  If I'm wrong and that character is not in the font then the
> square was the default print for character not found.  I'd like to be
> able to run a routine through FOP to get out a list of all unicodes and
> what characters they go with for a particular font.  When I tried FOP
> 1.0, that same code produced a pound #.

Hmm, sounds like a regression. I guess we'll have to look into that then.
And such a glyph dump utility is definitely something FOP could profit
from. Has anybody already written something like that? We could
integrate it into org.apache.fop.tools.fontlist maybe.
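
A rough sketch of what such a glyph dump could look like, using only the
JDK rather than FOP's own font loading (so it reports what Java2D sees in
the font's "cmap" table, which may differ from what FOP does with the same
file). The class name GlyphDump and its methods are made up for
illustration; nothing like this exists in FOP as far as this thread
establishes.

```java
import java.awt.Font;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper, not part of FOP: lists the code points a font covers.
class GlyphDump {

    /** Collects every defined code point in [first, last] that the predicate accepts. */
    static List<Integer> coveredCodePoints(IntPredicate canDisplay, int first, int last) {
        List<Integer> covered = new ArrayList<>();
        for (int cp = first; cp <= last; cp++) {
            if (Character.isDefined(cp) && canDisplay.test(cp)) {
                covered.add(cp);
            }
        }
        return covered;
    }

    /** Formats one output line: "U+XXXX" followed by the Unicode character name. */
    static String describe(int cp) {
        String name = Character.getName(cp);
        return String.format("U+%04X %s", cp, name == null ? "(unnamed)" : name);
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a TrueType file, e.g. the LTYPE.TTF mentioned above.
        Font font = Font.createFont(Font.TRUETYPE_FONT, new File(args[0]));
        // The lambda parameter is an int, so this calls Font.canDisplay(int).
        for (int cp : coveredCodePoints(cp -> font.canDisplay(cp), 0x0020, 0xFFFF)) {
            System.out.println(describe(cp));
        }
    }
}
```

Running it against the font file and looking for the U+25AB line would at
least tell whether the square is really in the font or was only the
"character not found" glyph.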

> The biggest problem I'm having running FOP 0.95 is the threading.  I've
> tried calling it from a Java SwingWorker and it's not resolving the
> issue.  I'm running a javax.swing.JProgressBar as indeterminate and it
> freezes while I'm transforming FOP output, so the users think the
> program is just stuck and I have to explain to them it's supposed to do
> that the first time.  If they run it twice in a row the second one is
> much smoother.

I've never used FOP in a way that it interacts with a Swing GUI. Maybe
there's some interaction with AWT/Java2D since FOP uses Java2D
extensively depending on the output format. But it absolutely makes
sense to run FOP in a different thread than AWT's event loop.
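
For reference, a minimal sketch of keeping the rendering off the event
dispatch thread with SwingWorker. The class BackgroundRender and the
renderJob callable are hypothetical; renderJob stands in for whatever code
sets up FOP and runs the transformation. Whether this resolves the freeze
described above is not something this sketch can confirm.

```java
import java.util.concurrent.Callable;
import javax.swing.SwingWorker;

// Hypothetical wrapper: runs a rendering job on a worker thread.
class BackgroundRender extends SwingWorker<byte[], Void> {

    private final Callable<byte[]> renderJob; // assumed to wrap the FOP transformation
    private final Runnable onFinished;        // e.g. stop the indeterminate progress bar

    BackgroundRender(Callable<byte[]> renderJob, Runnable onFinished) {
        this.renderJob = renderJob;
        this.onFinished = onFinished;
    }

    @Override
    protected byte[] doInBackground() throws Exception {
        // Called on a SwingWorker pool thread, never on the event dispatch thread.
        return renderJob.call();
    }

    @Override
    protected void done() {
        // Called on the event dispatch thread once doInBackground() has finished.
        onFinished.run();
    }
}
```

The worker is started with execute(). Calling run() directly, or calling
get() from the event dispatch thread before the job has finished, executes
or waits on the calling thread and would block the GUI again.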

> Getting smaller results is nice but not necessarily a priority.
> Reducing a 2 MB file to 35 K is high priority.  Reducing a 46 K file to
> 35 K is not a big deal.  Getting consistent output is top priority.
> 
> 
> -----Original Message-----
> From: Jeremias Maerki [mailto:[email protected]] 
> Sent: Thursday, November 11, 2010 3:35 PM
> To: [email protected]
> Subject: Re: TrueType Font Embedding
> 
> Hi Chris
> 
> I fully understand the desire to install the font on a PostScript
> printer to keep the PS files smaller. To answer your question: I did not
> ask for the business use case. The problem I'm struggling with in this
> context is how to know about the CID meaning of the font, i.e. the
> multi-byte encoding of the font.
> 
> When we do subsets in FOP, we re-index the glyphs starting with index 1
> (or 3) by occurrence in the document. Only FOP knows which Unicode
> character is represented by which CID. That's why we need the ToUnicode
> CMap in PDF. Otherwise, text extraction would not be so easy.
> 
> In single-byte mode, the whole font is embedded (right now probably with
> the same problems I've just fixed with rev1034094 for the TTF subset).
> In this mode the Adobe character names map into the font, so 8-bit
> encodings can be built to properly address the right characters even if
> the font is not embedded. That's also how we currently do referenced TTF
> fonts for PDF output.
> 
> If we fully embed the font as a CID font, we currently lose the
> knowledge about which index represents which Unicode character.
> Combining the font with a suitable CMap resolves the problem but at the
> moment we only use Identity-H which is a 1:1 mapping. One solution would
> be to turn the Unicode "cmap" table in the TrueType font into a custom
> PS CMap and then use 16-bit Unicode characters directly. FOP currently
> doesn't support that.
> 
> Also, if some PS platform allows uploading naked TrueType fonts, how
> will they be represented in the PS VM? Are they CID fonts then or
> single-byte fonts? If they are CID fonts, which CID system are they
> following? I have no idea. The only way to be sure about this is by
> installing a CID font plus CMap that is generated by FOP (which can be
> done by extracting these resources from one of the PS streams). After
> that, the font can be referenced, but it may not be portable to other
> PS-generating applications.
> 
> And then, as Glen mentioned, we have to have a strategy to deal with
> glyphs with no representation in Unicode. I think I get where he goes
> with that and it seems to be close to the CMap I mentioned above that is
> derived from the Unicode "cmap" table in the TrueType font. At any rate,
> FOP then has to learn to output Unicode characters (including private
> area chars) instead of arbitrary CIDs coming from subsetting.
> 
> In the end, I'm not 100% sure I've understood all implications here. I
> hope we'll get there soon. I guess a Wiki page would do us good here.
> 
> 
> 
> Jeremias Maerki
> 




Jeremias Maerki
