Hi Mason,

On Mon, Sep 28, 2009 at 7:40 PM, Mason James <mason.loves.su...@gmail.com> wrote:
>
> so - i'm curious... is there a newer/better way to get around the
> less-than-perfect character-conversion issues with UTF to PDF, that were
> discussed on the lists in the last year or so (approx)

The UTF to PDF conversion issue appears to be caused primarily by the fact that the PDF stream uses glyph IDs rather than Unicode to display strings, and there is no direct, one-to-one relationship between Unicode code points and glyph IDs. The reason that *some* Unicode chars come across OK is more ascribable to chance than to design: it happens when the Unicode value *happens* to match the font's glyph ID. What really should be happening is that a "ToUnicode" table is built and embedded in the PDF file so that the relationship between Unicode and glyph IDs is properly defined.

Logically, the next question is: how is this to be accomplished? The answer is: I have no concrete idea atm.

I *think* that the first issue at hand is that the "standard 14 fonts" do not extend far enough into the Unicode char set to be usable, afaict. So we will need to use fonts which do (e.g. GNU FreeFont, http://www.gnu.org/software/freefont/).

The second issue is to understand how ISO 32000-1 defines building a ToUnicode CMap (sect. 9.10.3) and grind out some code to construct these. That probably means more modifications to PDF::Reuse; I have made a number already, which the maintainer has agreed to include in the next release toward the end of October.

It may be as simple as embedding Unicode TTFs in the PDF file. If that is the case, the code for that is already in place in both PDF::Reuse and PDF::API2. I'm not convinced that the solution is anywhere near that simple, though, or it would have been done by now.

But this is all somewhat subject to sudden and dramatic change, as I'm still very much on the PDF learning curve and could be way off target. I have had some correspondence with a platform architect at Adobe who has kindly offered to help clarify any questions regarding Unicode and PDF.

Any thoughts, information, suggestions, etc. are most gratefully appreciated.
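
For what it's worth, here is a rough sketch of the kind of ToUnicode CMap text that would need to be generated. It is in Python purely for illustration, not the Perl that PDF::Reuse would need. It assumes 2-byte character codes that are identical to the font's glyph IDs, which as far as I understand only holds for an Identity-H encoded composite font. The function name `to_unicode_cmap` and the glyph IDs in the usage note are made up, and the CMap boilerplate follows my reading of the example in sect. 9.10.3, so treat it as a starting point rather than a tested solution.

```python
def to_unicode_cmap(glyph_to_unicode):
    """Return ToUnicode CMap text for a {glyph_id: unicode_code_point} dict.

    Assumption: character codes in the content stream are 2-byte values
    equal to the glyph IDs (hypothetical setup, not taken from PDF::Reuse).
    """
    def utf16_hex(code_point):
        # Destination values are written as UTF-16BE hex, so code points
        # above U+FFFF come out as surrogate pairs.
        return chr(code_point).encode("utf-16-be").hex().upper()

    entries = sorted(glyph_to_unicode.items())
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<0000> <FFFF>",
        "endcodespacerange",
    ]
    # As I understand the CMap format, each bfchar block may hold at most
    # 100 entries, so the mapping is emitted in chunks of 100.
    for start in range(0, len(entries), 100):
        chunk = entries[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        for glyph_id, code_point in chunk:
            lines.append(f"<{glyph_id:04X}> <{utf16_hex(code_point)}>")
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"
```

Calling `to_unicode_cmap({3: 0x20, 36: 0x41})` with those invented glyph IDs gives a CMap whose single bfchar block maps `<0003>` to `<0020>` and `<0024>` to `<0041>`. The resulting text would still have to be written into the PDF as a stream object and referenced from the font dictionary's /ToUnicode entry, which is the part I have not worked out yet.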

Kind Regards,
Chris

_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha.org
http://lists.koha.org/mailman/listinfo/koha-devel