Re: [fw-general] Zend_Pdf requirements

Rick Gigger Sun, 11 Nov 2007 01:36:19 -0800

On Nov 10, 2007, at 9:29 PM, Willie Alberty wrote:

On Nov 10, 2007, at 3:47 PM, Rick Gigger wrote:
A few questions:

1. Are you going to be using xsl-fo for your layout model?
No. XSL-FO does not provide enough fidelity for my needs. The text model I am working on is based on the OpenStep (Cocoa) frameworks, and has an attributed string class at its core. Attributes such as font, color, size, kerning, line height, tab stops, etc. may be applied to arbitrary ranges of characters within such strings.
...
In the end, Zend_Pdf will have three useful layers for document layout: an FO interpreter for multi-page document flows, the lower-level layout classes for drawing attributed strings with finer-grained control, and the existing drawing primitives already in Zend_Pdf for precise glyph-by-glyph placement.

I wish I had time to just wait for it to all be completed. :) I have implemented something that is not nearly as robust but that more or less handles all of my current needs and is working now. I will be interested to see what you think as I clean up the code and release it for public use / comments.

2. Does your revised font code involve support for Unicode and/or non-Latin-1 fonts?
The font code in Zend_Pdf can already handle Unicode character sets. All fonts used by the framework, whether the standard 14 PDF fonts, or custom OpenType fonts, extract and use a Unicode character-to-glyph map for drawing.
The reason that Zend_Pdf cannot currently draw non-Latin-1 strings lies in the text encoding method used when drawing the glyphs on the page. Right now, we've hard-coded using the WinAnsiEncoding method when drawing text (see Zend_Pdf_Resource_Font::encodeString()), primarily because it's a built-in encoding and doesn't require a lot of code to support.
Drawing characters outside the Latin-1 range requires several changes to the font program's resource dictionary. At best, a /Differences array must be added, and at worst -- especially for CJK fonts -- a CID font would need to be created. Neither approach is very easy at this point because of how Zend_Pdf_Resource objects create and manage their resource dictionary.
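
To make the encoding limitation above concrete, here is a minimal Python sketch (it is not Zend_Pdf code). It assumes that Python's cp1252 codec is a close enough stand-in for PDF's WinAnsiEncoding for illustration; the helper names encodable_in_winansi and unencodable_chars are hypothetical.

```python
def encodable_in_winansi(text):
    """Return True if every character fits in a single cp1252 byte.

    Assumption: cp1252 approximates PDF's WinAnsiEncoding closely enough
    for illustration; the two are not guaranteed to be identical.
    """
    try:
        text.encode("cp1252")
    except UnicodeEncodeError:
        return False
    return True


def unencodable_chars(text):
    """List the characters that a single-byte Latin encoding cannot carry."""
    return [ch for ch in text if not encodable_in_winansi(ch)]
```

Any character this reports is one that would need either a /Differences array or a CID font, as described above.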

The CID fonts are what I am interested in, as my main need here is to add Japanese and Chinese support. There is a fairly straightforward way to do this in FPDF and links to examples on the front page (http://fpdf.org/). Looking at the font, resource and dictionary code in Zend_Pdf I am having trouble seeing where exactly the problem is. It seems that adding CID font support to Zend_Pdf wouldn't be much, if any, harder than adding it to FPDF. Can you elaborate on what the specific difficulty is? Perhaps I just don't understand the code well enough. I think I might take a crack at adding CID font support here in the near future and see if I can get it to work. Maybe then it will become clear to me what the specific difficulty is. Would you like to see my code if I do?

Let's assume for a moment that CID font support is complete and working. Now, someone is using it to create a PDF and calls $thisPage->drawText(...) and specifies a UTF-8 encoding for the text. (Remember this is the future where CID font support exists and drawText doesn't convert everything down to Latin-1 anymore.) The text has Chinese characters in it. As far as I can see it is still the caller's job to know the nature of the text being added before the call to drawText. If the current font is set to a Latin-only font then the Chinese characters will show up as gibberish or not at all, or an error will be thrown because the current font doesn't have glyphs for Chinese. So the caller must know that even though the text is Unicode, it contains Chinese and so the font must be switched to a font that has the Chinese glyphs (a CID font).

Is that your intention, or do you plan on actually including in Zend_Pdf the ability to just add Unicode text and have it pick an appropriate font, switch to it, convert the text to the encoding of that font, and then add the text? It seems like this functionality would make more sense in one of the upper layers that you spoke of above. Wherever it goes, though, the real tricky issue in my mind is as follows: You have a form that a user of your software fills out. Your site handles and stores all text as UTF-8. Somewhere you've got to decide what language it's in and what font it needs. This is made particularly difficult by ambiguous text (for example, some Japanese Kanji might look just like some Traditional Chinese), and mixed text (let's say you've got some Thai, Japanese, and Korean all in the same sentence). For the latter problem it seems like you would need to split it up into multiple text runs and switch fonts each time the language changes. That definitely seems like it belongs in an upper layer. And in that case Zend_Pdf need only add support for adding and using CID fonts while the upper layers handle the rest.
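
To sketch what I mean by splitting mixed text into runs, here is a rough Python illustration (not Zend_Pdf code). The range table is a deliberately tiny assumption covering only a few Unicode blocks, and the names SCRIPT_RANGES, script_of and split_runs are hypothetical.

```python
from itertools import groupby

# Assumption: a small hand-picked table of Unicode block ranges. A real
# implementation would need complete script data, not just these blocks.
SCRIPT_RANGES = [
    (0x0E00, 0x0E7F, "thai"),
    (0x3040, 0x30FF, "kana"),    # Hiragana and Katakana blocks
    (0x4E00, 0x9FFF, "han"),     # ideographs shared by Chinese and Japanese
    (0xAC00, 0xD7AF, "hangul"),  # Hangul syllables
]


def script_of(ch):
    """Classify one character by code point; unlisted ones are 'default'."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return "default"


def split_runs(text):
    """Split text into (script, substring) runs, one per script change."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=script_of)]
```

An upper layer could pick a font per run. Note that a "han" run cannot be labelled Japanese or Chinese from code points alone, which is exactly the ambiguity I mentioned; that decision would still need outside information such as a language hint from the caller.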

I began the work necessary for full Unicode support in the framework last year with the introduction of TrueType fonts. I now have a paying project which will allow me to return to this code and complete it. I will have more to report in this area in the coming weeks.

I'll be interested to see what you come up with and to share my work with you. Are you planning on working on CID font support?

Just curious: if that function works correctly, why has such a basic function as text measuring not been included in Zend_Pdf if the work has already been done?
All of the primitives exist in the framework, and you'll notice that ZF-313 is over a year old. Anyone who requires this support for their particular project can easily do the calculations themselves. The question comes up every two or three months and we generally direct those people to that function.
The primary reason such a function does not exist in the framework right now is support. The function I wrote for ZF-313 works, but does not do the calculation in the most efficient way possible, particularly if you're trying to use it to determine where to break a line (it's a Shlemiel the painter's algorithm -- http://www.joelonsoftware.com/articles/fog0000000319.html). Furthermore, once a function appears in the framework, there are backwards-compatibility issues that must be considered.
Since 95% of the people who ask the question, "How do I measure a line of text?" are really asking, "How do I wrap the long lines of a paragraph?" or "How do I center this title?", creating and supporting a string measurement function is the wrong problem for the framework to solve.
Instead, the framework should provide layout classes that make it drop-dead simple to wrap long lines of text and center titles on a page. This is what I've been focusing on. My goal is to not have to create a text measurement function for the framework because the layout classes provided make it unnecessary.
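
The efficiency point can be sketched as follows, in Python rather than Zend_Pdf code. This is neither the ZF-313 function nor the planned layout classes; wrap_words is a hypothetical helper, and char_width is an assumed callback standing in for whatever per-glyph metrics the font exposes.

```python
def wrap_words(text, max_width, char_width):
    """Greedy word wrap that keeps a running line width.

    Each word is measured once and added to a running total, so the cost
    is linear in the text length, instead of re-measuring the whole line
    from its first character every time a word is appended.
    """
    space = char_width(" ")
    lines, current, width = [], [], 0.0
    for word in text.split():
        w = sum(char_width(ch) for ch in word)
        # Width of the line if this word were appended to it.
        needed = w if not current else width + space + w
        if current and needed > max_width:
            lines.append(" ".join(current))
            current, width = [word], w
        else:
            current.append(word)
            width = needed
    if current:
        lines.append(" ".join(current))
    return lines
```

With a constant width of 1 per character, wrap_words("aa bb cc dd", 5, lambda ch: 1.0) breaks after "aa bb". A word wider than max_width is simply placed on its own line in this sketch.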

Yes, now that I look at it more closely and think about my text wrapping algorithm it will end up being MUCH faster to use those lower level functions. Thanks for the explanation.

Well, now I've got to decide on how to prioritize. I want to keep making progress in cleaning up and publishing my XML/DOM code, but the Unicode / CID font work is much more pressing for my business concerns (i.e. it's paid). So I think I'll try working on both for a while and see what happens.

Thanks again for all the info, I hope our current work overlaps so we can help each other out.


Rick
