Re: Writing PDF Documents and other source code parts

Vincent Hennebert Fri, 02 Oct 2009 03:15:39 -0700

Hi Alexander,

I can’t really help I’m afraid, as I personally don’t have the necessary
knowledge. It’s probably time to submit what you already have as a patch
attached to a Bugzilla entry:
https://issues.apache.org/bugzilla/enter_bug.cgi?product=Fop
That will allow us to have a look and maybe provide some additional
guidance.


How feasible would it be to write a thin layer on top of your library
that would bridge the gap between it and the current one? That would be
a temporary layer until the PDF code is in turn refactored, allowing you
to keep the new library clean (do we really want write support for
OpenType files??). Refactoring the PDF code now would take you too far
afield. Stay focused on fonts (as much as possible) for now.
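
A minimal sketch of what such a thin layer could look like, assuming an
adapter approach. None of these names come from FOP or from the new
library: LegacyFontMetrics, OpenTypeFontModel and OpenTypeFontAdapter are
hypothetical stand-ins, and the single getWidth method only illustrates
the idea.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for what the existing code expects from a font. */
interface LegacyFontMetrics {
    int getWidth(int charCode);
}

/** Hypothetical stand-in for a read-only font model from the new library. */
final class OpenTypeFontModel {
    private final Map<Integer, Integer> glyphIndexByCodePoint;
    private final int[] advanceWidths;

    OpenTypeFontModel(Map<Integer, Integer> glyphIndexByCodePoint, int[] advanceWidths) {
        // Defensive copies keep the model independent of the caller's data.
        this.glyphIndexByCodePoint = new HashMap<>(glyphIndexByCodePoint);
        this.advanceWidths = advanceWidths.clone();
    }

    int glyphIndex(int codePoint) {
        // Glyph 0 is the .notdef glyph, used for unmapped characters.
        return glyphIndexByCodePoint.getOrDefault(codePoint, 0);
    }

    int advanceWidth(int glyphIndex) {
        return advanceWidths[glyphIndex];
    }
}

/** Thin adapter: translates legacy calls into calls on the new model. */
final class OpenTypeFontAdapter implements LegacyFontMetrics {
    private final OpenTypeFontModel font;

    OpenTypeFontAdapter(OpenTypeFontModel font) {
        this.font = font;
    }

    @Override
    public int getWidth(int charCode) {
        return font.advanceWidth(font.glyphIndex(charCode));
    }
}
```

The adapter is the only class that knows both sides, so it could be
dropped once the PDF code is refactored, without touching the new
library.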

BTW, have you submitted your ICLA? 155 new classes... We’re gonna need
one :-)

Thanks,
Vincent


Alexander Kiel wrote:
> Hi,
> 
> I know my goal is to implement basic OpenType support for FOP. But
> while working on font subsetting/embedding, I came across the actual
> PDF output routines.
> 
> I think that this module needs refactoring. If you have a look at the
> PDFWritable interface, there is a really ugly method. The method
> outputInline takes an OutputStream and a Writer, which are related to
> each other. The comment says that the writer is buffered and every time
> you want to write something to the OutputStream, you have to flush the
> Writer first. That's crude.
> 
> What is really needed is some output interface which is able to do both,
> write chars and write bytes.
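
One possible shape for such an interface, as a minimal sketch only.
PdfOutput and StreamPdfOutput are hypothetical names, not existing FOP
classes, and encoding text as ISO-8859-1 is an assumption made for
illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sink that accepts both text and raw bytes, in call order. */
interface PdfOutput {
    void writeText(String text) throws IOException;

    void writeBytes(byte[] data) throws IOException;
}

/** Sends text and bytes to the same stream, so there is no second buffer to flush. */
final class StreamPdfOutput implements PdfOutput {
    private final OutputStream out;

    StreamPdfOutput(OutputStream out) {
        this.out = out;
    }

    @Override
    public void writeText(String text) throws IOException {
        // Assumption: the text is structural PDF syntax that fits in ISO-8859-1.
        out.write(text.getBytes(StandardCharsets.ISO_8859_1));
    }

    @Override
    public void writeBytes(byte[] data) throws IOException {
        out.write(data);
    }
}
```

Because both methods write to the same underlying stream, callers can
interleave text and binary data without flushing anything in between.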
> 
> I also had a look at PDFBox regarding writing PDFs. Maybe we shouldn't
> refactor FOP's own, perhaps somewhat legacy, PDF code. But I don't like
> the PDFBox code either.
> 
> So I'm a bit helpless now. The problem is the same regardless of which
> code I look at, be it:
> 
> TTFSubSetFile 
> 
>     This is all about reading a TrueType file, taking into account
>     some glyph mapping (the glyphs used) and returning a byte array
>     which contains the bytes of a TrueType file with the subset of
>     glyphs. This thing extends TTFFile, which is about representing a
>     TrueType file mixed with all the reading stuff. Here, reading,
>     writing and representing some real-world object are mixed in a
>     really ugly way.
> 
> PDFFactory
> 
>     This class does two things: creating and registering PDF objects.
>     A factory should only create objects. Then this class has nearly
>     1800 lines of code. Maybe it is a factory for too many things?
> 
>     If I look at the method which interests me, "makeFontFile", the
>     comment says: "Embeds a font.", but the method name is
>     "makeFontFile". "makeFontFile" makes sense in a factory. But
>     "Embeds a font." hints that this created font file is actually
>     embedded in the PDF document. Then this method has nearly 100
>     lines of code, which do all sorts of things that I can't
>     understand quickly. At some point the TTFSubSetFile is created
>     and the resulting bytes go into some PDFTTFStream - okay.
> 
>     So don't be surprised by memory problems. Here you have whole
>     300 KB+ fonts sitting in arrays.
> 
> MultiByteFont
> 
>     It seems to me that the MultiByteFont tracks the glyph usage:
>     "getUsedGlyphs", "mapChar", "subSet". I always thought that
>     fonts are immutable objects, representing a font program which
>     can be shared all over the application. Enjoy building
>     a common font source in FOP!
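
One way to get there would be to keep the font itself immutable and move
the usage tracking into a separate per-document object. The following is
a minimal sketch under that assumption. ImmutableFont and GlyphUsage are
hypothetical names, and only mapChar echoes a method name mentioned
above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical immutable font: answers lookups, never records usage. */
final class ImmutableFont {
    private final Map<Integer, Integer> glyphIndexByCodePoint;

    ImmutableFont(Map<Integer, Integer> glyphIndexByCodePoint) {
        this.glyphIndexByCodePoint =
                Collections.unmodifiableMap(new LinkedHashMap<>(glyphIndexByCodePoint));
    }

    int glyphIndex(int codePoint) {
        // Glyph 0 is the .notdef glyph, used for unmapped characters.
        return glyphIndexByCodePoint.getOrDefault(codePoint, 0);
    }
}

/** Hypothetical per-document tracker that owns all mutable subset state. */
final class GlyphUsage {
    private final ImmutableFont font;
    private final Map<Integer, Integer> subsetIndexByGlyph = new LinkedHashMap<>();

    GlyphUsage(ImmutableFont font) {
        this.font = font;
        subsetIndexByGlyph.put(0, 0); // keep .notdef at subset index 0
    }

    /** Returns the subset index for a character, assigning one on first use. */
    int mapChar(int codePoint) {
        int glyph = font.glyphIndex(codePoint);
        Integer existing = subsetIndexByGlyph.get(glyph);
        if (existing != null) {
            return existing;
        }
        int next = subsetIndexByGlyph.size();
        subsetIndexByGlyph.put(glyph, next);
        return next;
    }

    /** Read-only view: original glyph index mapped to subset index. */
    Map<Integer, Integer> usedGlyphs() {
        return Collections.unmodifiableMap(subsetIndexByGlyph);
    }
}
```

With this split, one ImmutableFont instance can be shared by any number
of documents, each holding its own GlyphUsage.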
> 
> I don't know how I should integrate my own code into it. I think a lot
> of refactoring is necessary in order to get the FOP parts into a state
> where I can integrate new code.
> 
> But I'm not sure where to start, and not sure if there are enough
> tests. I don't know the overall structure. I'm simply a bit helpless.
> 
> I have a nice fonts.opentype package here with 155 classes and 279 tests
> covering 93 % of the classes and 80 % of the lines. I can already read
> all of the TrueType metrics and OpenType kerning info. I have a class
> for every entity of the OpenType spec and a Reader for every such class.
> That means you can test reading every substructure alone. I think that
> this is a really nice API for reading OpenType files.
> 
> So now that I have seen what TTFSubSetFile really has to do, I will
> start adding write support for OpenType files. Then I will write some
> manipulation routine which can build a subset of a file. But I don't
> like getting the glyph mapping info for this manipulation from a
> MultiByteFont, which should really be immutable.
> 
> I found it sufficient to write a KerningMapBuilder which stuffs kerning
> pairs into a really nice doubly nested Map construction. As the comment
> on CustomFont#replaceKerningMap says:
> 
>     the kerning map (Map<Integer, Map<Integer, Integer>, 
>     the integers are character codes)
> 
> Such a highly specialized, self-explaining, problem-oriented data
> structure is spread all over the font system. Know your tools!
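
For comparison, a small dedicated type could hide the nested map behind
named operations. This is only a sketch: KerningTable is a hypothetical
name, not an existing FOP class, and returning 0 for a missing pair is
an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical value type wrapping the nested kerning map. */
final class KerningTable {
    // First character code -> (second character code -> adjustment).
    private final Map<Integer, Map<Integer, Integer>> adjustments = new HashMap<>();

    /** Records the kerning adjustment for the pair (first, second). */
    void put(int first, int second, int adjustment) {
        adjustments.computeIfAbsent(first, k -> new HashMap<>()).put(second, adjustment);
    }

    /** Returns the adjustment for the pair, or 0 if none is recorded. */
    int adjustment(int first, int second) {
        Map<Integer, Integer> row = adjustments.get(first);
        if (row == null) {
            return 0;
        }
        return row.getOrDefault(second, 0);
    }
}
```

Callers would then write table.adjustment('A', 'V') instead of handling
two levels of Map lookups and null checks themselves.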
> 
> So where to start?
> 
> Best Regards
> Alex
>
