You have real Unicode support in iText, but PDF and IE are two completely
different things. IE has access to all the fonts known to Windows, whereas
iText has the Java limitation of having to work on several platforms.
Only TrueType fonts support Identity-H, and the built-in fonts only have
Latin1 characters.
If you want to automagically select the font based on the characters, see the
example font_selector.java at itextpdf.sf.net.
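
The idea behind that kind of font selection can be sketched without any
iText classes. This is only a rough illustration and not the code of
font_selector.java: the class FontFallbackSketch, its methods, and the
coverage predicates are hypothetical stand-ins for real font objects, and it
assumes Java 8 or later. Fonts are tried in the order they were registered,
and the text is split into runs by the first font that covers each character.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

public class FontFallbackSketch {
    /** Marker used when no registered font covers a character. */
    public static final String UNCOVERED = "(none)";

    /** A run of text paired with the name of the font chosen for it. */
    public static final class Run {
        public final String fontName;
        public final String text;

        Run(String fontName, String text) {
            this.fontName = fontName;
            this.text = text;
        }
    }

    // Insertion-ordered: font name -> "does this font have a glyph for the code point?"
    // Both the names and the predicates are hypothetical placeholders.
    private final Map<String, IntPredicate> fonts = new LinkedHashMap<>();

    public FontFallbackSketch addFont(String name, IntPredicate covers) {
        fonts.put(name, covers);
        return this;
    }

    // Returns the first registered font that covers the code point.
    private String pick(int codePoint) {
        for (Map.Entry<String, IntPredicate> e : fonts.entrySet()) {
            if (e.getValue().test(codePoint)) {
                return e.getKey();
            }
        }
        return UNCOVERED;
    }

    /** Splits the text into consecutive runs that share the same chosen font. */
    public List<Run> process(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentFont = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String font = pick(cp);
            // A change of font closes the current run.
            if (current.length() > 0 && !font.equals(currentFont)) {
                runs.add(new Run(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = font;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(new Run(currentFont, current.toString()));
        }
        return runs;
    }
}
```

Each run could then be written with its own font. Characters that end up in
an UNCOVERED run are the ones that would otherwise silently disappear from
the output, so they are worth logging.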

Best Regards,
Paulo Soares

----- Original Message ----- 
From: "Nikolaj Brinch Joergensen" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, April 02, 2004 17:28
Subject: [iText-questions] Unicode support in PDF (IE can do this)


Hello,

We have a problem generating PDF and getting the right display of Unicode
texts.

We receive XML as the source of what we need to output as PDF. The XML is in
UTF-8 encoding (Unicode).
Furthermore, we receive data from a database that is also completely Unicode.
In the XML and the database all kinds of different text can be stored, with
all kinds of locales (Latin1, Latin2, Japanese etc.), at the same time.
It works great when just producing HTML: you specify UTF-8 as the
encoding of the HTML file, and IE (or Mozilla or Netscape) will
automatically detect this and choose the proper fonts to display each
character.

An example is a text in mixed Japanese and English, where the font is MS
Trebuchet.
This works great in IE, which detects that the Japanese is Japanese and
displays it that way, and displays the English text using Trebuchet (I guess
the Japanese text is displayed using Arial Unicode MS).
Furthermore, all the Polish text on the same HTML page is displayed
correctly.

I've read all the postings on this site about setting the encoding to
IDENTITY-H (blowing up the size of my PDF), but this will not work!
First of all, an UnsupportedEncodingException is thrown for Helvetica (one of
the 14 built-in fonts, for which Identity-H is not a supported encoding),
even though if you catch Exception (UnsupportedEncodingException is not
surfaced) you will get back Arial and output will be produced (for English
text, that is).
Second, when you ask iText to output a mixed text (Japanese/English)
in a chunk/phrase/paragraph, whatever, using MS Trebuchet, what you get
is only the Latin1 characters (the English characters), as the other
characters are not in the font.

My main concern is that my users want something to be displayed at the very
least: getting the data out has higher priority than its visual
appearance (no data integrity failures).

My other main concern, following from this, is how to do the
clever/intelligent font mapping that IE is apparently able to do (can you
actually run through a Unicode string and figure out that this character is
Japanese, this one is regular Latin1, this one is Latin2, etc.?).
That would make it possible to have real Unicode support in iText.
The way it is now, where you have to hardcode or set the encoding when
you create fonts, is in my opinion not transparent text handling with
regard to locales/encodings, and as such not real Unicode support. It makes
me think of the good old DOS days, when we had to set up the correct
codepages/encodings to get the right national characters on the screen.
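
To make the question about walking a Unicode string concrete, here is a
minimal sketch of such a classification. It assumes Java 7 or later for
Character.UnicodeScript; the class name ScriptRuns and its describe method
are made up for illustration and are not part of iText. Note that it groups
by script, so Latin1 and Latin2 letters both come out as LATIN;
Character.UnicodeBlock would be the finer-grained alternative.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

public class ScriptRuns {
    /**
     * Splits the text into runs of the same Unicode script and returns
     * each run as "SCRIPT:text".
     */
    public static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        UnicodeScript runScript = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);
            // Spaces, digits and punctuation are COMMON, combining marks are
            // INHERITED: keep them in the run they follow instead of
            // starting a new one.
            boolean neutral = script == UnicodeScript.COMMON
                    || script == UnicodeScript.INHERITED;
            if (neutral && runScript != null) {
                script = runScript;
            }
            // A change of script closes the current run.
            if (runScript != null && script != runScript) {
                out.add(runScript.name() + ":" + run);
                run.setLength(0);
            }
            runScript = script;
            run.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (run.length() > 0) {
            out.add(runScript.name() + ":" + run);
        }
        return out;
    }
}
```

With a result like this, each run could be mapped to a font known to contain
that script, which is roughly the kind of mapping I imagine IE is doing.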

If I could get some pointers to what to do, and how this font mapping could
be handled, I would be happy to contribute this to the library when it
works.

Sorry if I'm bitching but this is really nagging me :-)

Thanks,
NEKO


SAS Forum International Copenhagen 2004 - Bella Centret, June 15-17

Sign up now: http://www.sas.com/dk/sasforum




-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration. http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions


