Re: [iText-questions] contribution: FontReplacingPdfSmartCopy: duplicate TTF font subset merging and replacement (was: How to remove embedded fonts from a pdf document)

Lari Hotari Thu, 31 Jan 2008 08:28:38 -0800


Leonard Rosenthol wrote:
> 
> So this will only work if the FONT dictionary specifies a /Subtype
> of /TrueType, an /Encoding of /WinANSIEncoding and does not have a
> /Differences array - correct?
> 
> There is no support for Type 1, Type 1C, Mac encodings, for custom  
> encodings or for CID fonts, correct?
> 
> Also, how do you determine in the case of multiple subsets that the  
> fonts were from the same font originally?  Only by /BaseFont name?
>

I started the work as a proof of concept. My use case was only for TrueType
fonts, so I haven't tested much beyond that.

I also looked at replacing fonts with other types of encoding, such as
Identity-H for barcode fonts, but those use a custom encoding in the page
stream, and merging the encodings would take too much effort. It would also
require parsing and modifying the page stream.

The fonts are recognized only by the /BaseFont name; no other checking is
done. The FirstChar and LastChar ranges are updated in the final font (the
minimum FirstChar and the maximum LastChar are selected for the final
font).
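
To illustrate the range merging described above, here is a minimal sketch in
plain Java. It is not the code from the zip: the CharRange class and the
merge method are hypothetical names made up for this example, and it only
covers the FirstChar/LastChar selection, not the rest of the font dictionary.

```java
import java.util.Arrays;
import java.util.List;

public class CharRangeMerge {
    /** Hypothetical holder for one font's /FirstChar and /LastChar values. */
    static final class CharRange {
        final int firstChar;
        final int lastChar;

        CharRange(int firstChar, int lastChar) {
            this.firstChar = firstChar;
            this.lastChar = lastChar;
        }
    }

    /** Picks the minimum FirstChar and the maximum LastChar over all subsets. */
    static CharRange merge(List<CharRange> ranges) {
        if (ranges.isEmpty()) {
            throw new IllegalArgumentException("at least one range is required");
        }
        int first = Integer.MAX_VALUE;
        int last = Integer.MIN_VALUE;
        for (CharRange r : ranges) {
            first = Math.min(first, r.firstChar);
            last = Math.max(last, r.lastChar);
        }
        return new CharRange(first, last);
    }

    public static void main(String[] args) {
        // Two subsets of the same font covering different character ranges.
        CharRange merged = merge(Arrays.asList(
                new CharRange(32, 90), new CharRange(48, 122)));
        System.out.println(merged.firstChar + ".." + merged.lastChar);
    }
}
```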

Subset fonts have a name like "ZIGEYT+ComicSansMS". The subset fonts have
their real PostScript font name after the "+" sign.

PDF Reference, p. 419
"For a font subset, the PostScript name of the font—the value of the font’s
BaseFont entry and the font descriptor’s FontName entry—begins with a
tag followed by a plus sign (+). The tag consists of exactly six uppercase
letters; the choice of letters is arbitrary, but different subsets in the
same PDF file must have different tags. For example, EOODIA+Poetica is the
name of a subset of Poetica®, a Type 1 font. (See implementation note 63 in
Appendix H.)"
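
The naming rule quoted above can be checked with a small regular expression.
The following is only a sketch under assumptions: the class and method names
are hypothetical, and the input is assumed to be the BaseFont value as a
plain string without the leading "/" of a PDF name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubsetFontName {
    // Six uppercase letters, a plus sign, then the real PostScript font name.
    private static final Pattern SUBSET_TAG =
            Pattern.compile("^[A-Z]{6}\\+(.+)$");

    /** Returns true if the name carries a subset tag such as "ZIGEYT+". */
    static boolean isSubset(String baseFont) {
        return SUBSET_TAG.matcher(baseFont).matches();
    }

    /** Returns the name after the "+" sign, or the input if it has no tag. */
    static String realFontName(String baseFont) {
        Matcher m = SUBSET_TAG.matcher(baseFont);
        return m.matches() ? m.group(1) : baseFont;
    }

    public static void main(String[] args) {
        System.out.println(realFontName("ZIGEYT+ComicSansMS"));
        System.out.println(realFontName("Helvetica"));
    }
}
```

Two subsets with different tags then map to the same key, which is all that
name-based matching relies on.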

If you look at the source code you can see that the implementation is
currently fairly lightweight. It just extends PdfSmartCopy with the
possibility of font replacement/merging. It uses existing methods to write
fonts etc. (copy & paste from other iText classes in some places).
There's also a JUnit test case in the zip, under the test subdirectory.
It uses the PdfConcator helper class, which makes it easy to configure a
PdfConcator bean instance in IoC/DI (Spring Framework, Guice, etc.).

I hope that this work could serve as a baseline for adding font merging and
replacement features to iText.

It would be nice to have some kind of template method pattern or strategy
pattern for customizing the base solution for different use cases.
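
As a rough idea of what a strategy could look like, here is a sketch in
plain Java. Everything in it is hypothetical (the FontMatchStrategy
interface, its method, and the default implementation); nothing like this
exists in the zip or in iText. The default simply mirrors the current
behaviour of comparing /BaseFont names with the subset tag removed.

```java
public class FontMatchStrategyDemo {
    /** Hypothetical strategy: decides whether two fonts may be merged. */
    interface FontMatchStrategy {
        boolean sameFont(String baseFontA, String baseFontB);
    }

    /** Default: compare /BaseFont names after removing the subset tag. */
    static final class BaseFontNameStrategy implements FontMatchStrategy {
        @Override
        public boolean sameFont(String baseFontA, String baseFontB) {
            return stripTag(baseFontA).equals(stripTag(baseFontB));
        }

        // A subset tag is six uppercase letters plus "+", so 7 characters.
        private static String stripTag(String name) {
            return name.matches("[A-Z]{6}\\+.+") ? name.substring(7) : name;
        }
    }

    public static void main(String[] args) {
        FontMatchStrategy strategy = new BaseFontNameStrategy();
        System.out.println(
                strategy.sameFont("ZIGEYT+ComicSansMS", "ABCDEF+ComicSansMS"));
    }
}
```

A stricter strategy (for example one that also compares font descriptor
entries) could then be plugged in without touching the copy logic.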

Lari
