Rick Gigger wrote:
Yes, I was aware of those. I don't use them because I implemented my
own XML-based solution that supports tables, page breaking,
columns, full rich text, etc. The basic support in those two
engines was never enough. All I really need are the primitives to
measure, style and draw text, and of course primitive line and shape
drawing operations. I have an abstraction layer that lets me swap
back and forth between R&OS and FPDF. My XML parser / DOM style
rendering library handles the rest. So I don't think it would be too
hard to use Zend_Pdf instead.
That sounds really interesting... I can see why switching to Zend_Pdf
wouldn't be too hard at that point.
The PDF renderer needs to map the text it is trying to render to an
available font. Unfortunately none of the standard PDF fonts are
Unicode fonts. So if you want to be able to actually put Unicode text
in and have it understood then you have to (if I am reading the spec
correctly) embed the Unicode font right into the PDF. Licensing
issues aside, entire Unicode fonts are about 10-20 MB. Most of the
PDFs I generate are under 50 KB so that's certainly not going to
work. So what I am left with is to determine the language of each
block of text and convert it to a local encoding that the PDF spec
will happily accept. That seems a little tricky.
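
As a rough illustration of the "convert it to a local encoding" step, here
is a minimal Python sketch that tries a few legacy codecs in order and
keeps the first one that can represent the whole block. The candidate list,
its order, and the name pick_local_encoding are assumptions made up for
this example; none of it comes from the PDF spec, R&OS, FPDF or Zend_Pdf.

```python
# Candidate legacy encodings, tried in order. The list and its order are
# assumptions for illustration only.
CANDIDATE_ENCODINGS = ["cp1252", "shift_jis", "big5", "euc_kr"]


def pick_local_encoding(text, candidates=CANDIDATE_ENCODINGS):
    """Return (codec_name, encoded_bytes) for the first codec that can
    represent every character in text, or None if none of them can."""
    for name in candidates:
        try:
            return name, text.encode(name)
        except UnicodeEncodeError:
            # This codec is missing at least one character; try the next.
            continue
    return None


if __name__ == "__main__":
    for sample in ["hello", "日本語のテキスト", "한국어"]:
        result = pick_local_encoding(sample)
        print(sample, "->", result[0] if result else None)
```

Note that the order matters: a block that several codecs can represent
goes to whichever comes first, so this only picks an encoding that works,
not the language the author intended.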
I'm no PDF expert, but for some reason I thought you could embed partial
fonts. Meaning, you would only have to embed the glyphs you actually
used in the document. Seemingly, you could do that with the Unicode font
to drastically bring down the size requirement.
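
To get a feel for how small such a subset might be, here is a minimal
Python sketch that just collects the distinct characters a document
actually uses. It does not touch any font file or PDF library; the
function name used_code_points is made up for this example, and a real
subsetter would still have to map these code points to glyphs in the font.

```python
import unicodedata


def used_code_points(text_blocks):
    """Return the sorted list of distinct code points used across all
    text blocks, skipping control characters such as newlines."""
    used = set()
    for block in text_blocks:
        for ch in block:
            if unicodedata.category(ch) != "Cc":
                used.add(ord(ch))
    return sorted(used)


if __name__ == "__main__":
    blocks = ["Invoice 2006\n", "日本語のテキスト"]
    points = used_code_points(blocks)
    print(len(points), "distinct characters would need glyphs")
```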
Say you have a small segment of Japanese with only a few Kanji
characters. How do I distinguish that from a few characters of
Traditional Chinese? Let's say I figure that part out but then they
have, say, Korean mixed into the same text with Japanese. I have to
split out the Korean from the Japanese and draw them separately, each
time converting to a local encoding and indicating the correct font to
use.
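
The splitting step could be sketched roughly like this in Python, using
Unicode block ranges to cut a string into runs. The function names and the
coarse range table are assumptions for illustration. It also shows where
the hard part is: Kanji and Traditional Chinese characters share the same
unified Han code points, so they come out as one ambiguous "han" class and
the code points alone cannot tell them apart.

```python
from itertools import groupby


def script_of(ch):
    """Classify one character by a few Unicode block ranges (deliberately
    coarse; extension blocks and compatibility forms are ignored)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x30FF:
        return "kana"    # Hiragana and Katakana: Japanese only
    if 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF:
        return "hangul"  # Hangul syllables and jamo: Korean only
    if 0x4E00 <= cp <= 0x9FFF:
        return "han"     # Unified ideographs: Japanese OR Chinese
    return "other"


def split_runs(text):
    """Split text into consecutive (script, substring) runs."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=script_of)]


if __name__ == "__main__":
    for script, run in split_runs("日本語のテキスト한국어"):
        print(script, run)
```

A "han" run next to a "kana" run is probably Japanese, but a Han-only
fragment would need some outside hint, such as a language tag in the
source XML.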
Anyway that is the problem as I see it. Once I get this figured out I
would happily contribute any or all of it to Zend_Pdf if the author
wants it. I have this hope that someone working on Zend_Pdf is
going to say that I've read the spec all wrong and that I can somehow
add Unicode text directly to the PDF and have the PDF reader map it
all to a Unicode font if present or into the various non-Unicode local
fonts if it's not. And of course if whoever is working on the
UTF-8 support for Zend_Pdf figures it all out and implements it, I'll
be more than happy to switch my rendering engine to use
Zend_Pdf for primitives and get the Unicode support for free.
That definitely sounds like a tricky problem. I haven't read all the PDF
spec, nor have I worked much with the Zend_Pdf code, so unfortunately I
don't have any answers here. I suppose you could just build a PDF file
with all those characters, and see if it comes out as gibberish.
Regards,
Bryce Lohr