On 07/03/14 12:23, Robert wrote:
About a week ago I posted a patch to add Type 1 subset support to FOP. All referenced
Type 1 fonts (unless set to embedding-mode="full") will now be subset by
default much like the behaviour exhibited by TrueType and OpenType. As this is a big
feature and quite involved, I think it is necessary to vote on whether to add this feature
in its current state to FOP. I'm not sure if anyone has taken a look at what has gone
into this or tried it out yet, but it might be worth doing so before making your decision.
I am going to be away for the next week or so but will tally up the votes and
post the result once I am back.
Here is a link to the patch and issue:
From the quick look I had at the patch, I must say that some things are
sources of concern to me:
• The PostScript parser seems to be mixing lexical analysis, syntax
analysis and interpretation. This makes it hard to follow and I could
not figure out the meanings of the conditions in the various ‘if’
statements inside the ‘parse’ method. Also, part of the parsing seems
to be leaking into Type1SubsetFile. I’m concerned about the robustness
of the thing. For example, there are unguarded calls to
Integer.parseInt. How tolerant will that be to malformed font files?
• It seems that Type1SubsetFile tries to infer the mapping of character
codes to glyph names. That essentially re-does what the mapChar method
has already done earlier, with probable mismatch between the outputs
of the two methods. In Type1SubsetFile.readEncoding I see references
to the WinAnsi encoding, which may have nothing to do at all with the
font’s own encoding. I suspect this is the source of the exception
thrown when running the FO I attached to the issue.
• There is a lot of memory allocation. First, the font is entirely
loaded in memory in Type1SubsetFile.createSubset, then again in
PFBParser, plus data copied around when creating the subset. Surely
some of this memory allocation can be avoided. Have you profiled the
code? How much slower is it than fully embedding the font?
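
To illustrate the point about unguarded Integer.parseInt calls, here is a
minimal sketch of a guarded parse. The TokenInts class and its tryParse
method are hypothetical names for illustration only and do not come from the
patch; how the parser should react to a malformed token (skip it, fall back
to full embedding, report an error) is left open here.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of the patch: wraps Integer.parseInt so that a
// malformed numeric token in a font file yields an empty result instead of an
// uncaught NumberFormatException.
final class TokenInts {
    private TokenInts() { }

    // Returns the parsed value, or OptionalInt.empty() if the token is null
    // or is not a valid decimal integer.
    static OptionalInt tryParse(String token) {
        if (token == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(token.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

A caller would then check isPresent() on the result and decide explicitly
what to do with a bad token, rather than letting the exception propagate out
of the parse method.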
Due to the possible regressions and the potential impact on performance,
I must vote -1 against enabling Type 1 subsetting by default. If Type 1
subsetting is left as an option that can be manually configured by the
user, then I vote +0.