PdfSmartCopy only shares the streams. The merged file still lists many
fonts, but they all share the same font data.
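
To illustrate the point about shared streams, here is a toy model in plain Java. It is not iText's code: the class and method names (StreamSharingSketch, copyStream, copyFont) are made up for this sketch, and it assumes that stream bodies are cached by their content while each font dictionary is still copied as its own object.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of content-based stream sharing; NOT iText's actual code. */
public class StreamSharingSketch {

    /** Stream bodies written to the output, indexed by object number. */
    private final List<byte[]> writtenStreams = new ArrayList<>();
    /** Cache from stream content to the object number it was written under. */
    private final Map<ByteBuffer, Integer> cache = new HashMap<>();
    /** One entry per copied font dictionary: the stream number it points at. */
    private final List<Integer> fontDictionaries = new ArrayList<>();

    /** Returns the object number for this content, writing it only once. */
    public int copyStream(byte[] content) {
        // ByteBuffer compares by content, so equal bytes hit the same cache entry.
        ByteBuffer key = ByteBuffer.wrap(content.clone());
        Integer existing = cache.get(key);
        if (existing != null) {
            return existing;
        }
        writtenStreams.add(content.clone());
        int number = writtenStreams.size() - 1;
        cache.put(key, number);
        return number;
    }

    /** Font dictionaries are always copied; only their font data is shared. */
    public void copyFont(byte[] fontProgram) {
        fontDictionaries.add(copyStream(fontProgram));
    }

    public int fontCount() {
        return fontDictionaries.size();
    }

    public int streamCount() {
        return writtenStreams.size();
    }

    public static void main(String[] args) {
        StreamSharingSketch out = new StreamSharingSketch();
        byte[] ocr = {0, 1, 2, 3}; // stand-in bytes for the OCR font program
        for (int i = 0; i < 3; i++) {
            out.copyFont(ocr.clone()); // same font arriving from three statements
        }
        System.out.println(out.fontCount() + " fonts, " + out.streamCount() + " stream(s)");
    }
}
```

In this model, three statements with the same font give three font entries but one stored stream. That matches a font list that looks repetitive while the file size barely changes.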
Paulo
----- Original Message -----
From: Jason Berk
To: [email protected]
Sent: Friday, December 23, 2011 3:41 PM
Subject: [iText-questions] PdfSmartCopy not reusing fonts
Using iText 5.1.2.
I have created a number of "statements" using iText. The only
non-Helvetica font is an OCR font that has 10 glyphs: 0 through 9.
I have a Fonts class like so:
import com.itextpdf.text.Font;
import com.itextpdf.text.pdf.BaseFont;

public class Fonts {
    // stored in my jar
    private static final String OCR_FONT_FILE = "/OCRAEXT.ttf";

    public static final Font OCR;

    static {
        BaseFont _ocr = null;
        try {
            _ocr = BaseFont.createFont(OCR_FONT_FILE,
                    BaseFont.WINANSI,
                    BaseFont.EMBEDDED);
            _ocr.setSubset(false); // do NOT use subsets
        } catch (Exception e) {
            // keep the original exception as the cause
            throw new RuntimeException("failed to load OCR font from "
                    + OCR_FONT_FILE, e);
        }
        OCR = new Font(_ocr, 11);
    }

    private Fonts() {
        // this is a utility class...do not instantiate it!
    }
}
My understanding was that, by default, iText will subset a font if the
entire font isn't used. Knowing that in the end I'm going to concatenate
all my PDFs into a single PDF (using PdfSmartCopy), I specify
setSubset(false). I realize this will increase the size of each
individual file (but only slightly: there are only 10 glyphs and the
font is 6.4KB). My goal was that when I concatenate my statements,
PdfSmartCopy would reuse the OCR font and not embed it repeatedly. Each
PDF also has the same high-quality graphic (logo). When I run this code:
FileOutputStream stream = new FileOutputStream(OUT);
Document doc = new Document(PageSize.LETTER);
PdfSmartCopy copy = new PdfSmartCopy(doc, stream);
copy.setFullCompression();
doc.open();
File[] statements = IN.listFiles();
for (File statement : statements) {
    String bookmark = statement.getName().replace(".pdf", "");
    PdfReader reader = new PdfReader(statement.getAbsolutePath());
    int pages = reader.getNumberOfPages();
    for (int p = 1; p <= pages; p++) {
        PdfImportedPage page = copy.getImportedPage(reader, p);
        if (p == 1) {
            new PdfOutline(copy.getRootOutline(),
                    new PdfDestination(PdfDestination.FITBV, page.getHeight()),
                    bookmark);
        }
        copy.addPage(page);
    }
    reader.close();
    copy.freeReader(reader);
}
doc.close();
copy.close();
stream.flush();
stream.close();
my resulting PDF still has a ton of fonts, most of which are not
embedded because they are Helvetica. The OCR font, though, shows as
embedded repeatedly. See the attached screenshot.
The API documentation for PdfSmartCopy
(http://api.itextpdf.com/itext/com/itextpdf/text/pdf/PdfSmartCopy.html)
says:
"PdfSmartCopy has the same functionality as PdfCopy, but when resources
(such as fonts, images,...) are encountered, a reference to these
resources is saved in a cache, so that they can be reused. This requires
more memory, but reduces the file size of the resulting PDF document."
But that's not the behavior I'm seeing.
After reading this:
http://itext-general.2136553.n4.nabble.com/How-to-remove-embedded-fonts-from-a-pdf-document-td2166185.html
I suspect it might have something to do with the font being a TrueType
font?
Just to be sure, I ran the resulting merged PDF through PdfSmartCopy a
second time (basically just making a copy). The font properties did not
change, but the file did shrink by 20KB (which is insignificant given
the PDF is 10+ MB).
I'm about to run this process on a data set that's MUCH larger. The
resulting PDF will be closer to 200MB. I'm curious how much of that is
the duplicate font, and what, if anything, I'm doing wrong that keeps
PdfSmartCopy from sharing it.
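
As a rough way to bound the "how much is the duplicate font" question, the sketch below computes the worst case, in which every statement carries its own copy of the 6.4KB font. The statement count of 5,000 is purely hypothetical (the post does not give one), and the class and method names are invented for the example.

```java
/** Back-of-the-envelope bound; the statement count is a hypothetical input. */
public class FontOverheadEstimate {

    /** Bytes spent on font data if every statement kept its own copy. */
    public static long worstCaseFontBytes(long statements, long fontBytes) {
        return statements * fontBytes;
    }

    /** Share of the merged file taken by that worst case, as a percentage. */
    public static double shareOfFile(long fontBytesTotal, long fileBytes) {
        return 100.0 * fontBytesTotal / fileBytes;
    }

    public static void main(String[] args) {
        long fontBytes = 6_554;               // about 6.4KB, the font size from the post
        long statements = 5_000;              // hypothetical number of statements
        long fileBytes = 200L * 1024 * 1024;  // about 200MB, the expected merged size
        long total = worstCaseFontBytes(statements, fontBytes);
        System.out.printf("%d bytes of font data, %.1f%% of the file%n",
                total, shareOfFile(total, fileBytes));
    }
}
```

With these assumed inputs the upper bound is roughly 31MB. If the font data really is stored only once, the actual cost is far below that bound.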
Thanks for any advice.
Jason Berk
[email protected]